A Nonlinear Mixture Model based Unsupervised Variable Selection in Genomics and Proteomics
نویسنده
چکیده
Typical scenarios occurring in genomics and proteomics involve small number of samples and large number of variables. Thus, variable selection is necessary for creating disease prediction models robust to overfitting. We propose an unsupervised variable selection method based on sparseness constrained decomposition of a sample. Decomposition is based on nonlinear mixture model comprised of test sample and a reference sample representing negative (healthy) class. Geometry of the model enables automatic selection of component comprised of disease related variables. Proposed unsupervised variable selection method is compared with 3 supervised and 1 unsupervised variable selection methods on two-class problems using 3 genomic and 2 proteomic data sets. Obtained results suggest that proposed method could perform better than supervised methods on unseen data of the same cancer type.
منابع مشابه
Nonlinear Sparse Component Analysis with a Reference: Variable Selection in Genomics and Proteomics
Many scenarios occurring in genomics and proteomics involve small number of labeled data and large number of variables. To create prediction models robust to overfitting variable selection is necessary. We propose variable selection method using nonlinear sparse component analysis with a reference representing either negative (healthy) or positive (cancer) class. Thereby, component comprised of...
متن کاملSteel Consumption Forecasting Using Nonlinear Pattern Recognition Model Based on Self-Organizing Maps
Steel consumption is a critical factor affecting pricing decisions and a key element to achieve sustainable industrial development. Forecasting future trends of steel consumption based on analysis of nonlinear patterns using artificial intelligence (AI) techniques is the main purpose of this paper. Because there are several features affecting target variable which make the analysis of relations...
متن کاملModel Selection for Mixture Models Using Perfect Sample
We have considered a perfect sample method for model selection of finite mixture models with either known (fixed) or unknown number of components which can be applied in the most general setting with assumptions on the relation between the rival models and the true distribution. It is, both, one or neither to be well-specified or mis-specified, they may be nested or non-nested. We consider mixt...
متن کاملAn Overview of the New Feature Selection Methods in Finite Mixture of Regression Models
Variable (feature) selection has attracted much attention in contemporary statistical learning and recent scientific research. This is mainly due to the rapid advancement in modern technology that allows scientists to collect data of unprecedented size and complexity. One type of statistical problem in such applications is concerned with modeling an output variable as a function of a sma...
متن کاملPrior Information Based Bayesian Infinite Mixture Model
Unsupervised learning methods have been tremendously successful in extracting knowledge from genomics data generated by high throughput experimental assays. However, analysis of each dataset in isolation without incorporating potentially informative prior knowledge is limiting the utility of such procedures. Here we present a novel probabilistic model and computational algorithm for semi-superv...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 2015